A Latent Space Approach to Dynamic Embedding of Co-occurrence Data

Authors

Purnamrita Sarkar

Sajid M. Siddiqi

Geoffrey J. Gordon

Abstract

We consider dynamic co-occurrence data, such as author-word links in papers published in successive years of the same conference. For static co-occurrence data, researchers often seek an embedding of the entities (authors and words) into a low-dimensional Euclidean space. We generalize a recent static co-occurrence model, the CODE model of Globerson et al. (2004), to the dynamic setting: we seek coordinates for each entity at each time step. The coordinates can change with time to explain new observations, but since large changes are improbable, we can exploit data at previous and subsequent steps to find a better explanation for current observations. To make inference tractable, we show how to approximate our observation model with a Gaussian distribution, which allows the use of a Kalman filter. The result is the first algorithm for dynamic embedding of co-occurrence data which provides distributional information for its coordinate estimates. We demonstrate our model both on synthetic data and on author-word data from the NIPS corpus, showing that it produces intuitively reasonable embeddings. We also provide evidence for the usefulness of our model by its performance on an author-prediction task.
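As a rough illustration of the filtering idea in the abstract, the following Python sketch tracks the coordinates of a single entity with a Kalman filter. It assumes random-walk dynamics (small Gaussian drift between time steps) and a plain Gaussian observation of the coordinates. This is a stand-in, not the paper's observation model: the Gaussian approximation to CODE is not given on this page, and the names `kalman_step`, `filter_track`, `drift_var`, and `obs_var` are hypothetical.

```python
import numpy as np


def kalman_step(mean, cov, obs, drift_var, obs_var):
    """One predict-and-update step for a random-walk state observed directly."""
    dim = mean.shape[0]
    identity = np.eye(dim)
    # Predict: coordinates drift by a small Gaussian step (assumed dynamics),
    # so the mean is unchanged and the uncertainty grows.
    pred_mean = mean
    pred_cov = cov + drift_var * identity
    # Update: assumed observation obs ~ N(x, obs_var * I), a placeholder for
    # a Gaussian approximation of the real co-occurrence likelihood.
    gain = pred_cov @ np.linalg.inv(pred_cov + obs_var * identity)
    new_mean = pred_mean + gain @ (obs - pred_mean)
    new_cov = (identity - gain) @ pred_cov
    return new_mean, new_cov


def filter_track(observations, drift_var=0.01, obs_var=0.25):
    """Run the filter over a (T, dim) array; return per-step means and covariances."""
    dim = observations.shape[1]
    mean, cov = np.zeros(dim), np.eye(dim)  # assumed standard-normal prior
    means, covs = [], []
    for obs in observations:
        mean, cov = kalman_step(mean, cov, obs, drift_var, obs_var)
        means.append(mean)
        covs.append(cov)
    return np.array(means), np.array(covs)
```

The returned covariances are what the abstract calls distributional information: each coordinate estimate comes with an uncertainty. Using data from subsequent time steps as well would require an additional backward smoothing pass, which this sketch omits.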

Similar resources

Approximate Kalman Filters for Embedding Author-Word Co-occurrence Data over Time

We address the problem of embedding entities into Euclidean space over time based on co-occurrence data. We extend the CODE model of Globerson et al. (2004) to a dynamic setting. This leads to a non-standard factored state space model with real-valued hidden parent nodes and discrete observation nodes. We investigate the use of variational approximations applied to the observation model that al...

Hierarchical Bayesian Embeddings for Analysis and Synthesis of High-Dimensional Dynamic Data

High-dimensional, time-dependent data are analyzed by developing a dynamic model in an associated low-dimensional embedding space. The proposed approach employs hierarchical Bayesian methods to learn a reversible statistical embedding, allowing one to (i) estimate the latent-space dimension from a set of training data, (ii) discard the training data when embedding new data, and (iii) synthesize...

Consistent Alignment of Word Embedding Models

Word embedding models offer continuous vector representations that can capture rich contextual semantics based on their word co-occurrence patterns. While these word vectors can provide very effective features used in many NLP tasks such as clustering similar words and inferring learning relationships, many challenges and open research questions remain. In this paper, we propose a solution that...

Latent Topic Embedding

Topic modeling and word embedding are two important techniques for deriving latent semantics from data. General-purpose topic models typically work in coarse granularity by capturing word co-occurrence at the document/sentence level. In contrast, word embedding models usually work in fine granularity by modeling word co-occurrence within small sliding windows. With the aim of deriving latent se...

Monocular 3D Human Motion Tracking Using Dynamic Probabilistic Latent Semantic Analysis

We propose a new statistical approach to human motion modeling and tracking that utilizes probabilistic latent semantic (PLSA) models to describe the mapping of image features to 3D human pose estimates. PLSA has been successfully used to model the co-occurrence of dyadic data on problems such as image annotation where image features are mapped to word categories via latent variable semantics. ...

Publication date: 2007
